Q-Learning in Unstable Relationships

Author

  • Ming Kang
Abstract

This study revisits Burdett, Imai and Wright's (BIW) model of endogenously unstable relationships [1]. In BIW, bilateral relationships are characterized as stable or unstable according to whether matched agents continue to search for new partners. Endogenous instability hinges on mutually reinforced beliefs about the search and acceptance behaviour of match partners in equilibrium. The model predicts multiple rational expectations (RE) equilibria in some instances and a unique RE equilibrium in others. Here, I investigate to what extent these strong equilibrium predictions are borne out when individuals are modelled as adaptive learning agents with no a priori knowledge of the environment. Attention is restricted to the version of the BIW model with just two match types and a break-up rule that precludes returning to one's previous partner once a new partner is found. Watkins' [9] Q-learning algorithm is used to simulate the adaptive learning process in the BIW environment. Consistent with the psychological concept of reinforcement, agents make subjective evaluations of behavioural alternatives based solely on direct experience, and higher-valued alternatives are selected more often than lower-valued ones. Simulations of the Q-learning process point to cases in which the equilibrium conditions are not always binding; indeed, divergence from the equilibrium path is frequently observed under realistic parameter configurations. The results suggest that adaptive learning processes like Q-learning can augment or diminish the presence of endogenous instability in actual relationships.
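
The two ingredients named in the abstract, a value update driven only by experienced payoffs and a choice rule that favours higher-valued alternatives, can be illustrated with a minimal tabular sketch. This is not the author's implementation: the softmax (Boltzmann) choice rule, the function names, and the parameters `alpha`, `gamma` and `temperature` are assumptions made for illustration, and the states and actions are left abstract rather than tied to the BIW model.

```python
import math
import random


def softmax_choice(q_values, temperature, rng):
    """Pick an action index with probability proportional to exp(Q / temperature).

    Assumed choice rule: higher-valued actions are chosen more often,
    but every action keeps a positive probability.
    """
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    return rng.choices(range(len(q_values)), weights=weights, k=1)[0]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """One-step Q-learning update, applied in place to the table q.

    q maps each state to a list of action values. alpha (learning rate)
    and gamma (discount factor) are hypothetical defaults.
    """
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]
```

In a simulation loop, an agent would call `softmax_choice` on the values of its current state, observe the payoff and the next state, and then call `q_update`; no knowledge of the environment's transition probabilities is required.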

Similar resources

Constructing and Validating a Q-Matrix for Cognitive Diagnostic Analysis of a Reading Comprehension Test Battery

Of paramount importance in the study of cognitive diagnostic assessment (CDA) is the absence of tests developed for small-scale diagnostic purposes. Currently, much of the research carried out has been mainly on large-scale tests, e.g., TOEFL, MELAB, IELTS, etc. Even so, formative language assessment with a focus on informing instruction and engaging in identification of student’s strengths and...

Exploration Methods for Connectionist Q-learning in Bomberman

In this paper, we investigate which exploration method yields the best performance in the game Bomberman. In Bomberman the controlled agent has to kill opponents by placing bombs. The agent is represented by a multi-layer perceptron that learns to play the game with the use of Q-learning. We introduce two novel exploration strategies: Error-Driven-ε and Interval-Q, which base their explorative ...

Mini/Micro-Grid Adaptive Voltage and Frequency Stability Enhancement Using Q-learning Mechanism

This paper develops an adaptive method for controlling the frequency and voltage of an islanded mini/micro grid (M/µG) using reinforcement learning. Reinforcement learning (RL) is a branch of machine learning and the main solution method for Markov decision processes (MDPs). Among the several solution methods of RL, the Q-learning method is used for solving RL in th...

Estimator Variance in Reinforcement Learning: Theoretical Problems and Practical Solutions

In reinforcement learning, as in many on-line search techniques, a large number of estimation parameters (e.g. Q-value estimates for 1-step Q-learning) are maintained and dynamically updated as information comes to hand during the learning process. Excessive variance of these estimators can be problematic, resulting in uneven or unstable learning, or even making effective learning impossible. Estimator va...

Evaluating project’s completion time with Q-learning

Nowadays project management is a key component in introductory operations management. The educators and the researchers in these areas advocate representing a project as a network and applying the solution approaches for network models to them to assist project managers to monitor their completion. In this paper, we evaluated project’s completion time utilizing the Q-learning algorithm. So the ...

Publication date: 2005